Updated parsing of problem.yaml by using checks as described in the specification#295
Updated parsing of problem.yaml by using checks as described in the specification#295Zazmuz wants to merge 7 commits intoKattis:developfrom
Conversation
…m.yaml. Co-authored-by: square-cylinder <hugo.soderbergh007@gmail.com> Cleaned up and added more checks for problem.yaml Co-authored-by: square-cylinder <hugo.soderbergh007@gmail.com> detect unknown fields f-strings compatible with older python versions fix bug
made it more obvious if old folder name is used, or folder is misspelled made check error messages clearer for people newer to problemtools closest word functionality bugfix index error
readded comment stating weird it is Name change and pytest fix
…roduced in python 3.12
Merged conflict in hello test
gkreitz
left a comment
There was a problem hiding this comment.
I started looking at this PR, and I think it needs relatively large amount of work. I don't understand several of the approach changes from #294 at all (not trying to merge the data into a common structure, putting all code in verifyproblem instead of a separate module, not having any test cases).
I ran out of time while reviewing this, but setting this as request changes to not delay giving you the feedback I had for the part I had time to review.
| self._problem.getProblemPart(InputValidators).validate(self) | ||
| anssize = os.path.getsize(self.ansfile) / 1024.0 / 1024.0 | ||
| outputlim = self._problem.get(ProblemConfig)['limits']['output'] | ||
| outputlim = self._problem.get(ProblemConfigLegacy)['limits']['output'] |
There was a problem hiding this comment.
Changing ProblemConfig to ProblemConfigLegacy like this feels dangerous, and makes a reader worry about how this works when reading in a new problem config. I'd advice at the very least signaling some intent here by accessing via ProblemConfigBase for fields that are the same for both formats.
A better solution would be to ensure the config as read by the rest of the program looks the same (has the same keys, and the same types) regardless of what version was used. I believe you did this in #294.
| # TODO: Decide if these should stay | ||
| # Some deprecated properties are inherited from problem config during a transition period | ||
| problem_grading = problem.get(ProblemConfig)['grading'] | ||
| problem_grading = problem.get(ProblemConfigLegacy)['grading'] |
There was a problem hiding this comment.
How does this work with a new config file, which lacks the grading key? (This question applies to many places where you access keys which are named differently, or have different value types in the two versions.)
| self.error(f'License needs to be one of: {self._VALID_LICENSES}') | ||
| if self._data['license'] == 'unknown': | ||
| self.warning("License is 'unknown'") | ||
| def fix_config_structure(self): |
There was a problem hiding this comment.
It looks to me like ProblemConfigBase is an abstract base class, and that this is an abstract method? If so, consider inheriting from ABC and use the decorator @abstractmethod here.
|
|
||
|
|
||
| def check(self, context: Context) -> bool: | ||
| if self._check_res is True: |
There was a problem hiding this comment.
Why did you rewrite the memoization of _check_res this way? It looks like the memoization is now broken.
| return self._check_res | ||
| elif self._check_res is not False: | ||
| self._check_res = True | ||
| to_check = [prop for prop in dir(self) if prop.startswith('_check_') and callable(getattr(self, prop))] |
There was a problem hiding this comment.
I'm not a fan of trying to find all methods with a special name and running them like this. Why was this approach taken? Do you end up with so many check methods that you can't cleanly list them?
| self._check_res = True | ||
| to_check = [prop for prop in dir(self) if prop.startswith('_check_') and callable(getattr(self, prop))] | ||
| for prop in to_check: | ||
| self.info(f'Checking "{prop}"') |
There was a problem hiding this comment.
Internal methods called should not be logged at info level.
| return best | ||
|
|
||
| class ProblemConfigLegacy(ProblemConfigBase): | ||
| DEFAULT_LIMITS = { |
There was a problem hiding this comment.
These defaults used to be specified in a problem.yaml config file (and user-configurable, as is documented in README.md). Does the fact that you list the defaults here imply that you dropped support for having the defaults configurable via config file (and user-configurable via a config file in the home directory)? If so, it feels like a big step backwards to hardcode these values rather than have them in a config file.
|
|
||
| def fix_config_structure(self): | ||
| self._data.setdefault('problem_format_version', formatversion.VERSION_LEGACY) | ||
| if self._data.get('problem_format_version') != formatversion.VERSION_LEGACY: |
There was a problem hiding this comment.
These checks (checking that something is one of a few allowed values) get very repetitive. It looks like you ened a helper function, something like
check_allowed_values(self, key: str, values: list, error_fn = self.error)
|
|
||
| if 'name' in self._data: | ||
| val = self._data['name'] | ||
| if type(val) is not str: |
There was a problem hiding this comment.
Similarly, here, you should have a check_type() function to make the code less repetitive.
Alternatively, both such check functions can probably be implemented using json-schemas.
Added checks for legacy and 2023-07 for parsing problem.yaml, making sure its correct and defaulting values.
Made sure to keep the code itself simple so that eventual changes and upkeep to it will be easy to go through with.
It has been tested on most PO problems from the last 5 years as well as islandic problems.
For the new format it has not been tested as much (due to lack of available configs), only a few edgecases and few self made configs.
If there is anything that looks weird or any questions I will be available.